ABSTRACT
Undergraduates solved statistical reasoning problems based on data presented in a
variety of types of graphs and tables. When assessing relative 
probabilities, students were equally successful at answering 
questions regardless of the data display type. When making data-based 
causal inferences, accuracy decreased and students were quite 
sensitive to differences in the data display. In the causal inference 
study, data presented in percentages produced more accurate responses 
than did data presented in frequencies; graphs elicited better 
problem-solving strategies than did contingency tables; and pie 
charts yielded the most consistently high accuracy of all the display 
types. Results support the claims that graph interpretation is 
distinct from graph decoding, and that the graph interpretation skill 
is not simply a function of the graph (or table) type, but rather is 
a complex interaction between the data display format, the type of 
problem to be solved, and the problem solver's facility with the 
reasoning underlying the particular problem type. Results therefore 
suggest that a major source of difficulty in graph and table 
interpretation lies in the translation of both the problem and the 
data display into appropriate and compatible mental representations. 
Numerous figures and tables present study findings, and there is a 
nine-item list of references. (Author/SLD) 
Abstract 



Undergraduates solved statistical reasoning problems based on data presented in 
a variety of types of graphs and tables. When assessing relative probabilities, students 
were equally successful at answering the questions regardless of the data display type. 
When making data-based causal inferences, accuracy decreased and students were quite 
sensitive to differences in the data display. In the causal inference study data presented 
in percentages produced more accurate responses than data presented in frequencies; 
graphs elicited better problem-solving strategies than contingency tables; and pie charts 
yielded the most consistently high accuracy of all the display types. 

These results support the claims that graph interpretation is distinct from graph 
decoding, and that graph interpretation skill is not simply a function of the graph (or 
table) type, but rather is a complex interaction between the data display format, the type 
of problem to be solved and the problem-solver's facility with the reasoning underlying 
the particular problem type. These students had relatively little difficulty using tables 
and bar graphs and frequency data to solve probability problems, but had considerable 
(yet variable) difficulty using the same types of data displays when solving causal 
inference problems. The students were adept at comparing ratios in the probability 
task, but generally less successful at comparing ratios in the causal inference task. 

In order to solve these problems, people must decide which of the quantities in 
the data display are relevant to the problem at hand and how these quantities should be 
combined to solve the problem. When someone is uncertain about what's relevant, s/he 
may look to the data display to guide their problem solving. In this case, excess 
information in the data display may add to the difficulty of the problem. When 
someone is adept at a particular type of reasoning, s/he knows how to identify and 
ignore irrelevant information in the data display. 

These studies indicate that although these students can decode graphs and tables, 
and can compose and compare ratios, the format of the data display influenced their ability 
to solve some problems. The results of these studies suggest that a major source of 
difficulty in graph and table interpretation for these students lies in the translation of 
both the problem and the data display into appropriate and compatible mental 
representations. 
Objectives 



With the growing use of graphs and tables to convey information in books and 
lectures, students' skill at interpreting these displays is increasingly important to their 
education. Media reliance on graphical representations of statistical information 
indicates that these skills have lifelong value. Graph interpretation is an essential 
component of statistical problem solving skill. 

Greeno (1987) and Schoenfeld (1986, 1987) have argued that instructional 
representations influence students' acquisition of concepts. In the context of statistical 
reasoning, both inside the classroom and out, the manner in which quantitative 
information is presented may also affect people's ability to use mathematical concepts. 
In order to solve such problems, both the problem statement and the relevant 
information in the graphical display must be translated into appropriate and compatible 
mental representations before one can begin to apply algebraic or statistical procedures. 
The ability to appropriately translate or interpret a graphical display depends partly on 
one's translation of the problem statement and, under some circumstances, the 
translation of the problem statement may be influenced by the nature of the available 
data. 

The goal of this research is to examine how the organization of information 
affects the way that information is interpreted. The studies reported here explore the 
translation problem in graph comprehension and identify some of the factors 
influencing success or failure at translating the graphical information display into 
mental models that support problem solving. 

Perspectives 

Previous research on graph interpretation discussed elementary graphical 
perception processes and memory processes in graph comprehension. However, the 
graphs used in these studies typically do not contain data: the areas on the graph do 
not represent barrels of oil or incidence of malaria; they are simply unlabeled shapes on 
the page. This work has emphasized the discriminability of symbols and perception of 
the sizes of areas on graphs. Summarizing this body of research, Cleveland (1985) 
classified the perceptual tasks in visual decoding of a graph, but the issue of using 
graphs to communicate information or using graphs to solve problems remains to be 
addressed. Visual decoding is a necessary component of graph interpretation, but it 
isn't sufficient. One graphical display may reveal a variety of interesting relationships 
among the data; however, decoding skills alone won't tell readers which of those 
relationships is pertinent to their interests. 

Interpreting graphically presented information requires more than abstract graph 
reading skills, or the ability to locate various pieces of information in a given type of 
information display. Problem solving based on data in graphs and tables also requires 
that the reader know what pieces of information are needed to solve the problem. 
Accessing an appropriate mental model is one way of "deciding" what information to 
seek. This requires the reader to select or construct an appropriate mental model based 
on the information available in memory, in the problem and, possibly, in the graph as 
well. 

Problem solving with tabular or graphically presented information also requires 
a translation of the problem and the data into terms consistent with the mental model. 
The translation is analogous to the process used for arithmetic story problem solving, in 
which a narrative about Chris and Pat and the number of marbles they own is 
translated into the equation 3 + 2 = ? , if the question is about the total number of 
marbles, or 3 - 2 = ? , if the question is about the difference between the numbers of 
marbles owned by each child. Thus, the translation of a given type of graph, e.g. a bar 
graph, will differ according to the kind of question being asked about the data in the 
graph. 

Building on the work of Pinker (1981) and Kosslyn (1985, 1989), McKnight and 
Fisher (1991) have developed a theory of graph comprehension with a variety of 
memory processes. These include accessing an appropriate mental model for the 
graphical stimulus and problem situation. This mental model guides further attention 
and perception to pull information from the graph. Finally, this information is fit to the 
mental model to serve as the core of the representation needed for cognitive processing 
to accomplish the task. This model suggests that graph comprehension will vary 
depending on the reader's choice of an appropriate mental model for the graphical 
stimulus and problem situation. The suitability of the mental model will determine its 
usefulness at guiding the extraction of information from the graph. 
A General Model of Data-Based Problem Solving 



First, translate the problem statement and the data display into compatible 
mental representations, and choose a strategy for solving the problem. This 
includes determining which pieces of information are needed and how they'll be 
combined. 

Next, decode the graph or table, seeking out those pieces of information, and 
then combine the information in accordance with the strategy chosen. 

Finally, translate the results back into the terms of the original problem 
statement. 

The present studies explore the translation problem in graph comprehension by 
investigating the relationship between problem solving and type of graphical display 
for two kinds of statistical reasoning problems. The probability judgement is one for 
which students are likely to possess good mental models; the less familiar causal 
judgement is likely to require construction of a mental model. 



Method 

Undergraduates in introductory psychology classes participated voluntarily by 
completing pencil and paper tasks in a group testing situation. Each student received 
several problems of one type, all illustrated with the same kind of graph or table. 
Statistical information was represented in either contingency tables, bar graphs or pie 
charts; some were based on frequency data and some on percentage data. Students 
were assigned randomly to problem type and display type. Study 1 included 151 
students answering probability questions. Study 2 included 253 students answering 
questions about cause. 
Study 1 Probability Judgement 

The probability questions asked about populations on each of three islands, for 
example: 

Homeowners Renters 

Island D 1000 500 

Island E 250 750 

Island F 1250 1250 

You are trying to sell renter's insurance and homeowner's 
insurance by dialing phone numbers selected at random from 
each island's telephone directory. 

On which island is a single call most likely to contact a renter? 

In the probability judgement condition, there were four display types: two kinds 
of frequency tables (with and without marginal totals) and two kinds of bar graphs 
(stacked and side-by-side). Approximately 40 students responded to each display type, 
and each student answered the same four questions. Examples of the four data display 
types used in this study are shown on the following page. 

In order to base the probability judgment on covariation information, an 
individual must compose a ratio of renters to total population for each island, and then 
determine which of the three ratios is the largest. An alternative (faulty) strategy might 
involve selecting the island with the greatest absolute number of renters. This strategy 
would be less computationally demanding and might be chosen by someone who did 
not understand the principle of random sampling. 
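
The contrast between the two strategies can be made concrete with a short sketch. The Python below uses the island frequencies from the example problem; the function names `ratio_strategy` and `absolute_count_strategy` are illustrative labels introduced here, not terms from the studies.

```python
# Frequencies from the example problem for each island.
ISLANDS = {
    "D": {"homeowners": 1000, "renters": 500},
    "E": {"homeowners": 250, "renters": 750},
    "F": {"homeowners": 1250, "renters": 1250},
}


def ratio_strategy(islands):
    """Pick the island with the largest ratio of renters to total population."""
    def renter_ratio(name):
        counts = islands[name]
        return counts["renters"] / (counts["homeowners"] + counts["renters"])
    return max(islands, key=renter_ratio)


def absolute_count_strategy(islands):
    """Faulty shortcut: pick the island with the most renters in absolute terms."""
    return max(islands, key=lambda name: islands[name]["renters"])


print(ratio_strategy(ISLANDS))           # E (750 of 1000 residents rent)
print(absolute_count_strategy(ISLANDS))  # F (1250 renters, but only half the population)
```

The two strategies disagree on this problem, which is what allows an answer to reveal which strategy a student used.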

Students solved the probability problems successfully regardless of the information display. 
Accuracy ranged from 81% correct for frequency tables with marginal totals, to 89% 
correct for stacked bar graphs. There was no effect of display type on the frequency of 
correct answers (F(3, 147) = .7142, p = .5450). These results suggest that most of the 
students were able to access (or construct) an adequate mental model, translate the 
problem statement, extract the relevant information from any of the display types, 
compare the ratios, and reach a correct conclusion. 




[Figure: examples of the four data display types used in Study 1. Each display 
was accompanied by the problem text given above.] 

Frequency table without marginal totals: 

              Homeowners    Renters 

Island D         1000          500 
Island E          250          750 
Island F         1250         1250 

Frequency table with marginal totals: 

              Homeowners    Renters    Total 

Island D         1000          500      1500 
Island E          250          750      1000 
Island F         1250         1250      2500 
Total            2500         2500      5000 

[Bar graphs (stacked and side-by-side): y-axis "Number of People"; x-axis 
Island D, Island E, Island F; legend Renter, Homeowner.] 



Study 2 Causal Inference 



The causal judgement study (Dibble & Shaklee, 1991) asked for a causal inference 
about the effect of sun (or shade) on leaf spots, for example: 



Spacemen decided to collect information to discover the effect 
of shade on leaf spots for eight different kinds of space plants. 
For each type of plant, the spacemen put some plants in the 
shade and some in the sun. Each problem below shows what 
happened to the leaves of those plants in each lighting 
condition. 

Plant name: HIX 

                Plants in Sun    Plants in Shade 
Have Spots           50%              89% 
No Spots             50%              11% 

Considering the information shown, what should the spacemen 
conclude about the effect of sun on leaf spots for hix plants? 
(Circle one) 

A. Sun causes spots on leaves. 

B. Sun prevents spots on leaves. 

C. Sun has no effect on leaf spots. 



In the causal task, there were five display types: two kinds of tables (containing 
frequencies or percentages), two kinds of stacked bar graphs (containing frequencies or 
percentages), and pie charts. There were approximately 60 students in each of the five 
groups. Each student judged eight problems, including four pairs of comparable 
positive and negative relationships between sun and leaf spots. 

For one problem pair, the strategy diagnostic problems, judgements would be 
inaccurate if based on cause-present outcomes alone. In order to base the causal 
judgment on covariation information, an individual needs to compose a ratio of the 
likelihood of leaf spots on plants in the sun (cause present outcomes), and compare it to 
the likelihood of leaf spots in the shade (cause absent outcomes). The problem pairs are 
described in the table below: 



Problem Types 

Percentage of plants with and without spots in each lighting condition 
(Sun = cause present, Shade = cause absent). Each pair contains two 
comparable problems. 

                        Sun                    Shade 
Problem pair      Spots   No Spots       Spots   No Spots 

DIAGNOSTIC         50%      50%           89%      11% 
                   50%      50%           11%      89% 

11_89              11%      89%           89%      11% 
                   89%      11%           11%      89% 

33_72              33%      67%           72%      28% 
                   72%      28%           33%      67% 

89_50              89%      11%           50%      50% 
                   11%      89%           50%      50% 
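
As a hedged illustration of why the strategy diagnostic problems separate the two approaches, the following Python sketch applies a covariation rule and a cause-present-only rule to the percentages of spotted plants in sun and shade. The source states only that judgements based on cause-present outcomes alone would be inaccurate on these problems; the specific 50% cut-off used in `cause_present_only_judgement` is an assumption made for illustration, and both function names are hypothetical.

```python
def covariation_judgement(spots_in_sun, spots_in_shade):
    """Compare the percentage of spotted plants with the cause present vs. absent."""
    if spots_in_sun > spots_in_shade:
        return "Sun causes spots on leaves."
    if spots_in_sun < spots_in_shade:
        return "Sun prevents spots on leaves."
    return "Sun has no effect on leaf spots."


def cause_present_only_judgement(spots_in_sun):
    """Assumed form of the faulty strategy: look only at plants in the sun.

    The 50% cut-off is a hypothetical decision rule chosen for illustration.
    """
    if spots_in_sun > 50:
        return "Sun causes spots on leaves."
    if spots_in_sun < 50:
        return "Sun prevents spots on leaves."
    return "Sun has no effect on leaf spots."


# Percentage of plants with spots: (sun, shade).
diagnostic = (50, 89)       # a strategy diagnostic problem
non_diagnostic = (11, 89)   # one problem from the 11_89 pair

print(covariation_judgement(*diagnostic))               # Sun prevents spots on leaves.
print(cause_present_only_judgement(diagnostic[0]))      # Sun has no effect on leaf spots.
print(covariation_judgement(*non_diagnostic))           # Sun prevents spots on leaves.
print(cause_present_only_judgement(non_diagnostic[0]))  # Sun prevents spots on leaves.
```

Under these assumptions the two rules agree on the non-diagnostic problem and disagree only on the diagnostic one.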



Examples of the five data display types used in the causal inference study are 
shown on the next page, with graphs of the results on the following page. 
Study 2 

Examples of display types 

[Figure: the HIX problem (plants in sun vs. plants in shade, with and without 
spots) shown in each display type. Recoverable panel labels: Table (%), 
Table (Frequency), Bar Graph (%), Bar Graph (Frequency).] 

[Figure: three graphs of results, titled "Bar Graphs vs. Tables", 
"Proportion vs. Frequency" and "Pie Charts vs. Bar Graphs (%)". Each has 
x-axis Problem Type (DIAGNOSTIC, 33_72, 11_89, 89_50) and a y-axis scaled 
from 0 to 100.] 



Causal judgement accuracy was generally poor relative to accuracy in the 
probability judgement task. Causal judgement task accuracy varied and depended 
heavily on display type. Across all eight problems, accuracy was poor for tables relative 
to graphs (63% vs. 71%, F(1, 314) = 6.92, p = 0.009), poor for frequency data in comparison 
with proportional data (62% vs. 72%, F = 9.14, p = 0.002), and strikingly good for 
pie charts (81%). Though the pie chart results were no more accurate across all eight 
problems than the bar graphs with percentages, there was a significant interaction 
between problem type and display type for this comparison: on the strategy diagnostic 
problems, pie charts led to significantly higher accuracy than bar graphs with 
percentage data (F = 8.71, p = 0.000). 

Previous research (Shaklee & Elek, 1988) has shown that a common error in 
statistically based causal reasoning is focussing on event outcomes when the cause is 
present and ignoring them when the cause is absent, e.g. someone asked about the effect 
of sun on leaf spots might note the proportion of spotted plants in the sun and fail to 
attend to the proportion of spotted plants in the shade. For two of the causal problems, 
this strategy would lead to an error, and for these more difficult strategy diagnostic 
problems the above patterns were especially strong, ranging from 19% correct in the 
frequency table condition to 81% correct in the pie chart condition; table vs. graph: 
25% vs. 37%; frequency vs. proportion: 25% vs. 36%. In general, students receiving the 
pie charts had little difficulty solving the causal problems, with an average accuracy 
across problems of over 80%, comparable to the results in the probability judgment 
study. 

The Role of the Data Display 

Although the display types in the causal inference study were physically 
comparable to the display types in the probability judgement study, and though both 
problem types required students to compose and compare ratios, students' ability to use 
the data in the displays varied considerably between problem types and also among 
display types. An explanation for this may lie in the nature of the various translation 
problems presented by these tasks. 

In the presumably more familiar probability judgment task the students were 
insensitive to variations in the information display, whereas in the statistically based 
causal inference task they were very sensitive to such differences. In the causal study, 
the strategy diagnostic problems showed a greater effect of display type. These findings 
support the idea that students who are uncertain about how to solve the problem are 
relying on information in the graphical display to help them translate, or interpret, the 
problem statement. 

Bar graphs show the data as two ratios, whereas tables require the students to 
compose those ratios for themselves. This did not matter in the probability judgement 
study, but in the causal judgement study students were significantly more accurate 
when judging bar charts rather than tables. The fact that the difference was greatest for 
the strategy diagnostic problems indicates that bar graphs tended to elicit better 
problem-solving strategies than tables. For non-diagnostic problems, faulty strategies, 
e.g. ignoring cause absent outcomes, may still lead the student to a correct conclusion. 

Causal judgements are significantly more accurate for percentage data than for 
frequencies. This suggests that composing and comparing two ratios may have been 
difficult for some students. Because improvement was as strong for the non-diagnostic 
problems as for the strategy diagnostic problems, it is unlikely that percentage data 
affected choice of problem solving strategy. Providing data in percentages may have 
reduced the computational demands for some students in the causal study. Note, 
however, that many students in the probability study successfully composed and 
compared three ratios! 
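
The extra computational step that a frequency display demands can be sketched as follows. The counts are hypothetical (the frequency data used in the study are not reproduced in this text); they were chosen only so that the resulting percentages resemble one of the strategy diagnostic problems. The helper name `percent_with_spots` is likewise an illustrative assumption.

```python
def percent_with_spots(spots, no_spots):
    """Compose the ratio that a percentage display would have supplied directly."""
    return 100 * spots / (spots + no_spots)


# Hypothetical frequencies, NOT taken from the study materials.
sun = {"spots": 9, "no_spots": 9}
shade = {"spots": 16, "no_spots": 2}

sun_pct = percent_with_spots(**sun)      # 50.0
shade_pct = percent_with_spots(**shade)  # about 88.9

# Only after both ratios are composed can they be compared.
print(round(sun_pct), round(shade_pct))  # 50 89
```

With percentage data the two function calls are unnecessary: the reader can go straight to the comparison.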

Causal judgement accuracy was greatest for students judging data in pie charts. 
This effect was especially strong for the strategy diagnostic problems. Both bar graphs 
and pie charts depict the data in ratios, yet pie charts were even more likely to elicit use 
of an improved problem solving strategy. If, in fact, the students are looking to the data 
display for clues toward translating the problem statement, the pie chart has the 
advantage of offering information about the two ratios, and virtually nothing else. The 
bar graph with percentages contains (superfluous) numbers on the Y-axis, which 
students may have attempted to incorporate in their solutions. One apparent effect of 
the pie chart was to raise students' awareness that statistical causal reasoning involves 
comparison of outcomes when the cause is present and absent. 

But why is it that students in the probability task can decode and interpret the 
graphs, compose and compare ratios, and yet students in the causal task have trouble 
with seemingly similar mental operations? The results of these two studies also suggest 
that graph and table comprehension is a context dependent skill. The interpretation of 
graphical (and tabular) displays appears to be closely linked to facility with the kind of 
reasoning called for by the problem, i.e., possession of a suitable mental model. The 
process of making probability judgements, even in a statistical context, is likely to be 
more familiar to these students than the process of statistically based causal inference. 
The students are able to adequately translate the probability question and decide what 
to do. Though the majority of these students have the graph-reading sub-skills to 
decode the graphs, and the computational skills to compose and compare ratios, many 
students in the causal study apparently did not know how to bring these skills to bear 
on the problems at hand. They were unable to access a mental model adequate to guide 
their translation of the problem statements, and apparently attempted to use the data 
display to guide their translation of the problems. 

When students understand the problem and know exactly what information they 
need in order to solve it, they can locate that information and ignore other details in the 
data display. This could be the case in the probability study, in which most of the 
students appear to be accessing a suitable mental model for representing the problem 
and the data. When the students are less certain about what the problem requires, as in 
the causal judgement task, they may look to the graphical display to find information 
that will help them select or construct an appropriate mental model. If the display is 
rich in information, as in the case of the frequency table, it isn't very useful for this 
purpose and can lead to confusion. If, however, the display contains the bare minimum 
of information needed, e.g. the pie chart, it may serve as a useful guide to translation of 
the problem statement. If this analysis is correct, we would expect to see the effect of 
display type disappear for people who are adept at data-based causal inference. 

Additional Considerations 

The probability questions described above call for a familiar judgement - 
likelihood - in the reasonably familiar context of populations. The more difficult causal 
judgment was framed in a relatively less familiar context of factors influencing leaf 
disease in hypothetical space plants. This leaves open the possibility that the difference 
in sensitivity to information display type is influenced by differences in the semantic 
context of the problems, as well as by the differences in the type of reasoning called for 
by the problem. 
The probability questions also differ from some of the causal questions with 
respect to the relative magnitude of the numbers involved in each data set. This 
difference does not exist for the causal data expressed in percentages or in pie charts. 
However, the small sample sizes in the frequency tables and frequency bar graphs may 
have influenced some students' willingness to base causal inferences on these data. 

Summary 

The function of quantitative graphics is to inform. Graphics organize 
information in ways that facilitate, impede or distort information processing. Graphs 
and tables can summarize information and call attention to certain patterns or 
relationships within the data. The match between the patterns relevant to the problem 
and the relationships emphasized by a particular display type influences whether the 
viewer is informed, confused or deceived with respect to a specific statistical issue. 
Different displays of the same information may elicit from the viewer different 
strategies for solving the same problem. Or, if a given strategy is used consistently, 
some information displays may facilitate or impede extraction of information needed 
for that strategy. The viewer's familiarity with the statistical problem, with the 
semantic context and with the relationships within the data will also influence the 
usefulness of the different data displays. 

Some Implications for Instruction 

Decoding graphs does not equal interpreting graphs. Decoding skills are 
necessary but not sufficient for using graphs and tables to make statistical inferences. 

When someone is uncertain about what's relevant, excess information in the data 
display may add to the difficulty of the problem. When someone is adept at a particular 
type of reasoning, s/he knows how to identify and ignore irrelevant information in the 
data display. 

Where statistical reasoning is concerned, it doesn't make sense to talk about 
learning to interpret a particular type of graph, e.g. a bar graph. Graph interpretation is 
a function of the reasoning required by the problem at hand; thus we have high 
accuracy for interpreting bar graphs in the probability judgement study, and relatively 
low accuracy interpreting bar graphs in the strategy diagnostic component of the causal 
inference study. 